Detecting high-quality posts in community question answering sites
نویسندگان
چکیده
Community question answering (CQA) has become a new paradigm for seeking and sharing information. In CQA sites, users can ask and answer questions, and provide feedback (e.g., by voting or commenting) to these questions/answers. In this article, we propose the early detection of high-quality CQA questions/answers. Such detection can help discover a high-impact question that would be widely recognized by the users in these CQA sites, as well as identify a useful answer that would gain much positive feedback from site users. In particular, we view the post quality from the perspective of the voting outcome. First, our key intuition is that the voting score of an answer is strongly positively correlated with that of its question, and we verify such correlation in two real CQA data sets. Second, armed with the verified correlation, we propose a family of algorithms to jointly detecting the high-quality questions and answers soon after they are posted in the CQA sites. We conduct extensive experimental evaluations to demonstrate the effectiveness and efficiency of our approaches. Overall, our algorithms can outperform the best competitor in prediction performance, while enjoying linear scalability with respect to the total number of posts. 2015 Elsevier Inc. All rights reserved.
منابع مشابه
Retrieving Rising Stars in Focused Community Question-Answering
In Community Question Answering (CQA) forums, there is typically a small fraction of users who provide high-quality posts and earn a very high reputation status from the community. These top contributors are critical to the community since they drive the development of the site and attract traffic from Internet users. Identifying these individuals could be highly valuable, but this is not an ea...
متن کاملFinding High-quality Content in Social Media with an Application to Community-based Question Answering
The quality of user-generated content varies drastically from excellent to abuse and spam. As the availability of such content increases, the task of identifying high-quality content in sites based on user contributions—social media sites—becomes increasingly important. Social media in general exhibit a rich variety of information sources: in addition to the content itself, there is a wide arra...
متن کاملCommunity-Based Question Answering via Asymmetric Multi-Faceted Ranking Network Learning
Nowadays the community-based question answering (CQA) sites become the popular Internet-based web service, which have accumulated millions of questions and their posted answers over time. Thus, question answering becomes an essential problem in CQA sites, which ranks the high-quality answers to the given question. Currently, most of the existing works study the problem of question answering bas...
متن کاملDetecting Duplicate Posts in Programming QA Communities via Latent Semantics and Association Rules
Programming community-based question-answering (PCQA) websites such as Stack Overflow enable programmers to find working solutions to their questions. Despite detailed posting guidelines, duplicate questions that have been answered are frequently created. To tackle this problem, Stack Overflow provides a mechanism for reputable users to manually mark duplicate questions. This is a laborious eff...
متن کاملAn Unsupervised Approach for Low-Quality Answer Detection in Community Question-Answering
Community Question Answering (CQA) sites such as Yahoo! Answers provide rich knowledge for people to access. However, the quality of answers posted to CQA sites often varies a lot from precise and useful ones to irrelevant and useless ones. Hence, automatic detection of low-quality answers will help the site managers efficiently organize the accumulated knowledge and provide high-quality conten...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Inf. Sci.
دوره 302 شماره
صفحات -
تاریخ انتشار 2015